Reader-Aware Multi-Document Summarization via Sparse Coding

نویسندگان

  • Piji Li
  • Lidong Bing
  • Wai Lam
  • Hang Li
  • Yi Liao
چکیده

We propose a new MDS paradigm called readeraware multi-document summarization (RA-MDS). Specifically, a set of reader comments associated with the news reports are also collected. The generated summaries from the reports for the event should be salient according to not only the reports but also the reader comments. To tackle this RAMDS problem, we propose a sparse-coding-based method that is able to calculate the salience of the text units by jointly considering news reports and reader comments. Another reader-aware characteristic of our framework is to improve linguistic quality via entity rewriting. The rewriting consideration is jointly assessed together with other summarization requirements under a unified optimization model. To support the generation of compressive summaries via optimization, we explore a finer syntactic unit, namely, noun/verb phrase. In this work, we also generate a data set for conducting RA-MDS. Extensive experiments on this data set and some classical data sets demonstrate the effectiveness of our proposed approach.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multi-Document Summarization Based on Two-Level Sparse Representation Model

Multi-document summarization is of great value to many real world applications since it can help people get the main ideas within a short time. In this paper, we tackle the problem of extracting summary sentences from multi-document sets by applying sparse coding techniques and present a novel framework to this challenging problem. Based on the data reconstruction and sentence denoising assumpt...

متن کامل

Reader-Aware Multi-Document Summarization: An Enhanced Model and The First Dataset

We investigate the problem of readeraware multi-document summarization (RA-MDS) and introduce a new dataset for this problem. To tackle RA-MDS, we extend a variational auto-encodes (VAEs) based MDS framework by jointly considering news documents and reader comments. To conduct evaluation for summarization performance, we prepare a new dataset. We describe the methods for data collection, aspect...

متن کامل

Query-Aware Sparse Coding for Multi-Video Summarization

Given the explosive growth of online videos, it is becoming increasingly important to relieve the tedious work of browsing and managing the video content of interest. Video summarization aims at providing such a technique by transforming one or multiple videos into a compact one. However, conventional multi-video summarization methods often fail to produce satisfying results as they ignore the ...

متن کامل

Multi-candidate reduction: Sentence compression as a tool for document summarization tasks

This article examines the application of two single-document sentence compression techniques to the problem of multi-document summarization—a “parse-and-trim” approach and a statistical noisy-channel approach. We introduce the Multi-Candidate Reduction (MCR) framework for multi-document summarization, in which many compressed candidates are generated for each source sentence. These candidates a...

متن کامل

Graph-based models for multi-document summarization

University of Ljubljana Faculty of Computer and Information Science Ercan Canhasi Graph-based models for multi-document summarization is thesis is about automatic document summarization, with experimental results on general, query, update and comparative multi-document summarization (MDS). We describe prior work and our own improvements on some important aspects of a summarization system, incl...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015